Performance Analysis of Enhanced Fine–grain Multithreaded Distributed–memory Systems
Abstract
In fine–grain multithreading, the processor switches threads in every cycle, so consecutive instructions are issued from different threads and data dependencies do not stall the pipeline. Enhanced fine–grain multithreading maintains a number of additional threads, which replace an active thread when it initiates a long–latency operation. The performance improvements due to enhanced multithreading are studied by analyzing a timed Petri net model of a fine–grain multithreaded architecture at the instruction execution level.
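The replacement mechanism described in the abstract can be illustrated with a small cycle-by-cycle simulation. This is only a minimal sketch under simplifying assumptions and not the timed Petri net model analyzed in the paper: every thread issues a fixed number of instructions (`run_length`) before each long-latency operation, the latency is constant, and thread contexts are visited in strict round-robin order. The function name `utilization` and all of its parameters are hypothetical and chosen for illustration.

```python
from collections import deque

def utilization(n_slots, n_spare, run_length, latency, cycles):
    """Fraction of cycles in which an instruction is issued (illustrative model)."""
    # Assumption: each active slot stores the number of instructions its thread
    # still issues before its next long-latency operation; None = empty slot.
    slots = [run_length] * n_slots
    ready = deque([run_length] * n_spare)  # additional threads ready to run
    waking = deque()                       # wake-up cycles of suspended threads
    issued = 0
    for cycle in range(cycles):
        # Threads whose long-latency operation has completed become ready again.
        # Constant latency keeps this queue sorted by wake-up cycle.
        while waking and waking[0] <= cycle:
            waking.popleft()
            ready.append(run_length)
        s = cycle % n_slots                # fine-grain: a different slot each cycle
        if slots[s] is None and ready:
            slots[s] = ready.popleft()
        if slots[s] is None:
            continue                       # no thread available: wasted cycle
        issued += 1
        slots[s] -= 1
        if slots[s] == 0:
            # The thread initiates a long-latency operation and is suspended;
            # an additional ready thread, if any, replaces it immediately.
            waking.append(cycle + 1 + latency)
            slots[s] = ready.popleft() if ready else None
    return issued / cycles

base = utilization(n_slots=4, n_spare=0, run_length=2, latency=8, cycles=1000)
enhanced = utilization(n_slots=4, n_spare=4, run_length=2, latency=8, cycles=1000)
print(base, enhanced)
```

With these arbitrary parameters, the configuration without additional threads leaves cycles idle while all threads wait, whereas four additional threads are enough to issue an instruction in every cycle. The numbers say nothing about the results reported in the paper; they only show the direction of the effect.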
Similar resources
W.M. Zuberek: Performance Analysis of Fine–grain Multithreaded Multiprocessors
Instruction–level multithreading is an architectural approach to tolerating long–latency memory accesses and synchronization delays in distributed–memory systems. The paper presents a timed Petri net model of a fine–grain multithreaded distributed–memory multiprocessor system at the instruction execution level, and illustrates performance analysis by results obtained from simulation of the deri...
Design and Evaluation of Dynamic Load Balancing Schemes under a Fine-grain Multithreaded Execution Model
The evolution of computer systems based on fine-grain multithreaded program execution models introduces both unique opportunities and tough challenges for the support of dynamic load balancing. Although load balancing is an active research topic in the distributed computing field, there is still a lack of a detailed study of the different dynamic load balancing strategies under a fine-grain mul...
Comparative Evaluation of Fine- and Coarse-Grain Approaches for Software Distributed Shared Memory
Symmetric multiprocessors (SMPs) connected with low-latency networks provide attractive building blocks for software distributed shared memory systems. Two distinct approaches have been used: the fine-grain approach that instruments application loads and stores to support a small coherence granularity, and the coarse-grain approach based on virtual memory hardware that provides coherence at a p...
JavaSplit: A Runtime for Execution of Monolithic Java Programs on Heterogeneous Collections of Commodity Workstations
This paper describes the design and presents the preliminary performance evaluation of JavaSplit, a portable runtime for distributed execution of multithreaded Java programs. JavaSplit transparently distributes threads and objects of an application among the participating nodes. Thus, it gains augmented computational power and increased memory capacity without modifying the Java multithreaded p...
Classification and Performance Evaluation of Hybrid Dataflow Techniques With Respect to Matrix Multiplication
This paper classifies hybrid dataflow techniques according to their instruction issuing technique. A software simulation is conducted to compare fine-grain dataflow to several hybrid dataflow techniques: multithreaded dataflow with direct token recycling as used in Monsoon, multithreaded dataflow with consecutive execution of the instructions within a thread as used in the Epsilon processors and in EM-...